(54) Apparatus for encoding a video signal using feature point based motion estimation



(57) A motion-compensated video signal encoder 
has a circuit for determining a predicted current frame 
based on a current frame and a previous frame of a dig- 
ital video signal. The circuit includes a region detection 
circuit for detecting a processing region encompassing 
a moving object from the previous frame based on a dif- 
ference between the current and the previous frames to 
generate region information representing the detected
processing region. Thereafter, a number of pixels is
selected from the pixels contained in the detected
processing region as feature points based on the region
information. A first set of motion vectors between the
current and the previous frames, each of the first set of
motion vectors representing a motion for each of the
selected pixels, is then detected. The first set of motion
vectors is used for predicting the predicted current 
frame and transmitted as a set of motion vectors of the 
video signal together with the region information. 
EP0 740

Description 

Field of the Invention 

The present invention relates to an apparatus for
encoding a video signal; and, more particularly, to an 
apparatus capable of effectively encoding a video signal 
by using a feature point based motion estimation. 

Description of the Prior Art

As is well known, transmission of digitized video 
signals can attain video images of a much higher quality 
than the transmission of analog signals. When an image 
signal comprising a sequence of image "frames" is
expressed in a digital form, a substantial amount of data
is generated for transmission, especially in the case of a
high definition television system. Since, however, the
available frequency bandwidth of a conventional trans-
mission channel is limited, in order to transmit the sub-
stantial amounts of digital data therethrough, it is
necessary to compress or reduce the volume of the
transmission data. Among various video compression
techniques, the so-called hybrid coding technique,
which combines temporal and spatial compression
techniques together with a statistical coding technique, 
is known to be most effective. 

Most hybrid coding techniques employ a motion 
compensated DPCM (differential pulse code modula-
tion), two-dimensional DCT (discrete cosine transform),
quantization of DCT coefficients, and VLC (variable
length coding). The motion compensated DPCM is a
process of estimating the movement of an object
between a current frame and a previous frame, and pre-
dicting the current frame according to the motion flow of
the object to produce a differential signal representing 
the difference between the current frame and its predic- 
tion. 

The two-dimensional DCT, which reduces or 
removes spatial redundancies between image data
such as motion compensated DPCM data, converts a
block of digital image data, for example, a block of 8x8
pixels, into a set of transform coefficient data. This tech-
nique is described in, e.g., Chen and Pratt, "Scene
Adaptive Coder", IEEE Transactions on Communica-
tions, COM-32, No. 3 (March 1984). By processing such
transform coefficient data with a quantizer, zigzag scan- 
ning, and VLC, the amount of data to be transmitted can 
be effectively compressed. 

Specifically, in the motion compensated DPCM,
current frame data is predicted from corresponding pre- 
vious frame data based on an estimation of the motion 
between the current and the previous frames. Such esti-
mated motion may be described in terms of two-dimen-
sional motion vectors representing the displacement of
pixels between the previous and the current frames. 

There have been two basic approaches to estimating
the displacement of pixels of an object. Generally, they
can be classified into two types: one is a block-by-block
estimation and the other is a pixel-by-pixel approach.

In the block-by-block motion estimation, a block in a 
current frame is compared with blocks in the previous 
frame until a best match is determined. From this, an 
interframe displacement vector (how much the block of 
pixels has moved between frames) for the whole block 
can be estimated for the current frame being transmit-
ted. However, in the block-by-block motion estimation,
a blocking effect at the boundary of a block may occur in
a motion compensation process; and poor estimates
may result if all pixels in the block do not move in the same
way, thereby decreasing the overall picture quality.

Using a pixel-by-pixel approach, on the other hand, 
a displacement is determined for each and every pixel. 
This technique allows a more exact estimation of the 
pixel value and has the ability to easily handle scale 
changes (e.g., zooming, movement perpendicular to the 
image plane). However, in the pixel-by-pixel approach, 
since a motion vector is determined at each and every 
pixel, it is virtually impossible to transmit all of the 
motion vector data to a receiver. 

One of the techniques introduced to ameliorate the 
problem of dealing with the surplus or superfluous 
transmission data resulting from the pixel-by-pixel 
approach is a feature point based motion estimation 
method. 

In the feature point based motion estimation tech- 
nique, motion vectors for a set of selected pixels, i.e., 
feature points, are transmitted to a receiver, wherein the 
feature points are defined as pixels of a previous frame 
or a current frame capable of representing a motion of 
an object so that motion vectors for pixels of a current 
frame can be recovered or approximated from those of 
the feature points in the receiver. In an encoder which 
adopts the motion estimation based on feature points, 
disclosed in a commonly owned copending application, 
U.S. Ser. No. 08/367,520, entitled "Method and Appara-
tus for Encoding a Video Signal Using Pixel-by-Pixel
Motion Estimation", a number of feature points are first 
selected from all of the pixels contained in the previous 
frame using a grid and/or edge detection technique.
Then, motion vectors for the selected feature points are 
determined, wherein each of the motion vectors repre- 
sents a spatial displacement between one feature point 
in the previous frame and a corresponding matching 
point, i.e., a most similar pixel thereto, in the current 
frame. Specifically, the matching point for each of the 
feature points is searched in a search region within the 
current frame, wherein the search region is defined as a 
region of a predetermined area which encompasses the 
position of the corresponding feature point.

Even though it is possible to greatly reduce the 
amount of data to be transmitted through the use of the 
aforementioned feature point based motion estimation 
technique, a large number of feature points are still selected
from not only the moving objects but also stationary
objects having no motion when the grid
and/or edge technique is used. The large number of feature
points may require rather complex circuitry to support
the above encoding method, or still impose a high level
of computational burden on the circuitry for detecting
the motion vectors therefor. Furthermore, it may be
required to further reduce the volume of data to be 
transmitted in order to successfully implement a low-bit 
rate codec system having, e.g., 64 kb/s transmission 
channel bandwidth. 

Summary of the invention 

It is, therefore, a primary object of the invention to 
provide an improved video signal encoding apparatus 
for use with a low-bit rate video coding system, which is 
capable of effectively encoding a video signal by esti-
mating a set of motion vectors selected from a region
encompassing a moving object in a video signal through 
the use of feature point based motion estimation. 

In accordance with the present invention, there is 
provided an apparatus, for use in a motion-compen- 
sated video signal encoder, for determining a predicted 
current frame based on a current frame and a previous 
frame of a digital video signal, comprising: region detec- 
tion block for detecting a processing region encompass- 
ing a moving object from the previous frame based on 
the difference between the current and the previous 
frames to generate region information representing the 
detected processing region; feature point selection 
block for selecting a number of pixels from the pixels 
contained in the detected processing region as feature 
points based on the region information; first motion vec- 
tor detection block for detecting a first set of motion vec- 
tors between the current and the previous frames, each 
of the first set of motion vectors representing a motion 
for each of the selected pixels; second motion vector 
detection block for producing a second set of motion 
vectors for all of the pixels contained in the current 
frame by using said first set of motion vectors; and 
motion compensation block for assigning the value of 
each of the pixels in the previous frame, each of the pix- 
els corresponding to one of the pixels in the current 
frame through one of the second set of motion vectors, 
as the value of said one of the pixels in the current 
frame, to thereby determine the predicted current 
frame. 

Brief Description of the Drawings 

The above and other objects and features of the 
present invention will become apparent from the follow- 
ing description of preferred embodiments given in con- 
junction with the accompanying drawings, in which: 

Fig. 1 provides an image signal encoding appara-
tus having a current frame prediction block in
accordance with the present invention;
Fig. 2 shows a detailed block diagram of the current
frame prediction block of Fig. 1;
Fig. 3 represents two overlapped exemplary frames
having a moving object;
Figs. 4A and 4B illustrate two types of grids to
select feature points;
Figs. 5A and 5B describe a technique for selecting
feature points through the use of grids and edges of
objects;
Figs. 6A and 6B explain how the current frame
motion vector is detected in accordance with the
present invention;
Fig. 7 depicts a detailed block diagram of the
processing region detection block of Fig. 2; and
Fig. 8 represents an array of block representative
values based on the difference between the two
frames shown in Fig. 3.

Detailed Description of the Preferred Embodiments 

Referring to Fig. 1, there is shown an encoding
apparatus for compressing a digital video signal, which
employs a current frame prediction block 150 in accord-
ance with the present invention. As shown, current
frame data is fed as an input digital video signal to a first
frame memory 100 which stores the input digital video
signal. The input digital video signal is also coupled to
the current frame prediction block 150 through a line
L10. Actually, the input digital video signal is read, on a
block-by-block basis, from the first frame memory 100
and provided to a subtracter 102 through a line L11. The
block size of the input digital video signal typically
ranges between 8x8 and 32x32 pixels.

The current frame prediction block 150 of the
present invention initially serves to determine a set of
motion vectors for a set of feature points by employing
the feature point based motion estimation which will be
described hereinafter with reference to Figs. 2 and 3,
wherein the feature points are selected in a processing
region of a reconstructed previous frame. After deter-
mining the motion vectors for all of the feature points
through the use of a current frame signal on the line L10
retrieved from the first frame memory 100 and a previ-
ous frame signal on a line L12 from a second frame
memory 124, the motion vectors are used for predicting
the current frame on a pixel-by-pixel basis in order to
generate a predicted current frame signal onto a line
L30. The motion vectors and region information repre-
senting the region location are also coupled through a
line L20 to an entropy coder 107.

The predicted current frame signal on the line L30
is subtracted from a current frame signal on the line L11
at the subtracter 102, and the resultant data, i.e., an
error signal denoting differential pixel values, is dis-
patched to an image signal encoder 105. At the image
signal encoder 105, the error signal is encoded into a
set of quantized transform coefficients, e.g., by using a
DCT and any of the known quantization methods.
Thereafter, the quantized transform coefficients are
transmitted to an entropy coder 107 and an image sig-
nal decoder 113.
At the entropy coder 107, the quantized transform
coefficients from the image signal encoder 105, the
motion vectors and the region information are coded
together by using, e.g., a known variable length coding
technique, and transmitted to a transmitter (not shown)
for the transmission thereof. In the meantime, the image 
signal decoder 113 converts the quantized transform 
coefficients from the image signal encoder 105 back 
into a reconstructed error signal by employing an 
inverse quantization and an inverse discrete cosine 
transform. The reconstructed error signal from the 
image signal decoder 113 and the predicted current 
frame signal on the line L30 from the current frame pre- 
diction block 150 are combined at an adder 115 to 
thereby provide a reconstructed frame signal to be 
stored as a previous frame signal in the second frame 
memory 124.
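
The prediction loop of Fig. 1 may be illustrated by the following minimal sketch. It is not the disclosed circuit: frames are assumed to be lists of rows of integers, a uniform quantizer with a hypothetical step `q` stands in for the DCT and quantization of the image signal encoder 105 (the transform itself is omitted), and the function names are chosen for illustration only.

```python
def encode_residual(current, predicted, q=4):
    """Subtracter 102 followed by a stand-in uniform quantizer (DCT omitted)."""
    return [[round((c - p) / q) for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]


def reconstruct(levels, predicted, q=4):
    """Stand-in for decoder 113 plus adder 115: dequantize, then add the prediction."""
    return [[p + level * q for level, p in zip(level_row, pred_row)]
            for level_row, pred_row in zip(levels, predicted)]


current = [[100, 104], [98, 97]]
predicted = [[96, 104], [90, 99]]

levels = encode_residual(current, predicted)
# The reconstructed frame is what would be stored in the second frame memory 124.
previous_for_next_frame = reconstruct(levels, predicted)
```

Because the reconstructed frame, rather than the original current frame, is stored as the previous frame, the encoder predicts from the same data that a decoder would have available.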

Referring to Fig. 2, there are illustrated details of 
the current frame prediction block 150 shown in Fig. 1. 
The current frame prediction block 150 is provided with 
a processing region detection block 210, a feature point 
selection block 230, a feature point motion vector 
search block 240, a current frame motion vector detec-
tion block 250 and a motion compensation block 260.

The current frame signal on the line L10 from the 
first frame memory 100 and the previous frame signal 
on the line L12 from the second frame memory 124 are
inputted to the processing region detection block 210. 
The processing region detection block 210 serves to 
detect the processing region having a moving object by 
using the difference between the current frame signal 
and the previous frame signal. Referring to Fig. 3, there 
are shown two overlapped exemplary frames 300 (i.e., 
the current and the previous frames), each frame having 
5x7 image blocks. Each of the image blocks includes a 
plurality of pixels, e.g., 16 x 16 pixels. Assuming that the
overlapped frames include four objects, e.g., three sta-
tionary objects 310 to 330 and a moving object 340, and
the difference between two frames appears in a region
350 which includes a plurality of image blocks encom-
passing the moving object 340, the region 350 will be
referred to as the processing region. After detecting the
processing region 350, the processing region detection
block 210 generates region information representing the
selected processing region based on the location data 
of the image blocks encompassing or encircling the 
moving object 340. The region information is then cou- 
pled to the feature point selection block 230, the current 
frame motion vector detection block 250 and the entropy 
coder 107. 

At the feature point selection block 230, a number 
of feature points are selected among the pixels con- 
tained in the processing region of the previous frame 
signal through the use of the region information from the 
processing region detection block 210 and the previous 
frame signal from the second frame memory 124. The 
feature points are defined as the pixels which are capa- 
ble of representing the motion of an object in the frame. 
In a preferred embodiment of the present invention, the
feature points are determined by a known grid tech-
nique employing one of various types of grid, e.g., a rec-
tangular grid or an overlapped hexagonal grid shown in
Figs. 4A and 4B, respectively. As shown in Figs. 4A and
4B, the feature points are located at the nodes of the
grid. In another preferred embodiment of the invention,
an edge detection technique is employed together with
the above described grid technique as shown in Figs.
5A and 5B. In this scheme, intersection points of the
grid and edge of the object are selected as feature
points. The selected feature points from the feature
point selection block 230 are inputted to the feature
point motion vector search block 240. The feature point
selection block 230 also serves to generate position
data for the feature points based on the region informa-
tion, which is also coupled to the feature point motion
vector search block 240 and the current frame motion
vector detection block 250.
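
The rectangular-grid selection restricted to the processing region may be sketched as follows. This is an illustrative sketch only: the region information is assumed to be a two-dimensional array of block representative values ("1" inside the processing region), and the function name, `block_size` and `spacing` are hypothetical parameters not taken from the disclosure. The hexagonal grid and the edge-based variant are not shown.

```python
def select_grid_feature_points(region_blocks, block_size, spacing):
    """Return (x, y) grid nodes lying in image blocks whose representative value is 1."""
    height = len(region_blocks) * block_size
    width = len(region_blocks[0]) * block_size
    points = []
    for y in range(0, height, spacing):          # grid rows
        for x in range(0, width, spacing):       # grid columns
            # Keep the node only if its image block belongs to the processing region.
            if region_blocks[y // block_size][x // block_size] == 1:
                points.append((x, y))
    return points


# One 4 x 4 block (top right) is marked as the processing region.
region_blocks = [[0, 1],
                 [0, 0]]
points = select_grid_feature_points(region_blocks, block_size=4, spacing=2)
```

Grid nodes falling in blocks marked "0" are skipped, so no feature points are spent on stationary areas.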

At the feature point motion vector search block 240,
a first set of motion vectors for the selected feature
points is detected based on the current frame signal on
the line L10 and the previous frame signal on the line L12.
Each of the motion vectors in the first set represents a
spatial displacement between a feature point in the pre-
vious frame and a most similar pixel thereto in the cur-
rent frame. There are numerous processing algorithms
available for use to detect the motion vectors on a pixel-
by-pixel basis. In the preferred embodiments of the
invention, there is used a block matching algorithm: that
is, a feature point block of the previous frame, hav-
ing a feature point at the centre thereof, is first retrieved via
the line L12 from the second frame memory 124 shown
in Fig. 1. Thereafter, a motion vector for the feature point
block is determined after a similarity calculation by
using an error function, e.g., MAE (mean absolute error)
or MSE (mean square error), between the feature point
block and each of a plurality of equal-sized candidate
blocks included in a generally larger search region of P
x Q, e.g., 10 x 10, pixels of the current frame retrieved
from the first frame memory 100 shown in Fig. 1,
wherein the motion vector is a displacement between
the feature point block and a candidate block which
yields a minimum error function. The determined motion
vector is then set as the motion vector of the feature
point. The motion vectors for the feature points are
applied, as the first set of motion vectors, to the current
frame motion vector detection block 250 and the entropy
coder 107 shown in Fig. 1 through the line L20.
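
The block matching search for one feature point may be sketched as follows, using MAE as the error function. The sketch makes several assumptions not stated in the disclosure: frames are lists of rows of integers, the feature point block has side `2 * half + 1`, the search covers displacements of at most `search` pixels in each direction, candidates that would leave the frame are skipped, and ties are resolved in favor of the first candidate examined. All names are hypothetical.

```python
def mean_absolute_error(block_a, block_b):
    """MAE between two equal-sized blocks given as lists of rows."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(block_a, block_b)
                for a, b in zip(row_a, row_b))
    return total / (len(block_a) * len(block_a[0]))


def get_block(frame, cx, cy, half):
    """Square block of side 2*half+1 centred on (cx, cy)."""
    return [row[cx - half:cx + half + 1] for row in frame[cy - half:cy + half + 1]]


def find_motion_vector(previous, current, fx, fy, half=1, search=2):
    """Displacement (dx, dy) of the candidate block with minimum MAE."""
    height, width = len(current), len(current[0])
    reference = get_block(previous, fx, fy, half)   # feature point block
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cx, cy = fx + dx, fy + dy
            # Skip candidate blocks that would extend beyond the frame.
            if not (half <= cx < width - half and half <= cy < height - half):
                continue
            error = mean_absolute_error(reference, get_block(current, cx, cy, half))
            if best is None or error < best[0]:
                best = (error, dx, dy)
    return best[1], best[2]


def make_frame(cx, cy, size=8):
    """Test frame: zero background with a distinct 3 x 3 pattern centred on (cx, cy)."""
    frame = [[0] * size for _ in range(size)]
    value = 10
    for y in range(cy - 1, cy + 2):
        for x in range(cx - 1, cx + 2):
            frame[y][x] = value
            value += 10
    return frame


previous = make_frame(3, 3)
current = make_frame(4, 5)
motion_vector = find_motion_vector(previous, current, 3, 3)
```

In this example the pattern moves one pixel to the right and two pixels down between the previous and the current frames, and the search returns that displacement.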

At the current frame motion vector detection block
250, a second set of motion vectors for all of the pixels
contained in the current frame is determined through
the use of the first set of motion vectors from the feature
point motion vector search block 240, the position data
of the feature points from the feature point selection
block 230 and the region information from the process-
ing region detection block 210. In accordance with the
preferred embodiment of the present invention, the sec-
ond set of motion vectors is determined by using a
known affine transform. In order to determine the sec-
ond set of motion vectors, quasi-feature points are
determined first, wherein the quasi-feature points repre-
sent the pixels of the current frame shifted from the fea-
ture points contained in the processing region of the
previous frame by the first set of motion vectors. After
determining the quasi-feature points (QP's), a plurality of
non-overlapping polygons, e.g., triangles, are defined by
connecting, e.g., three neighboring quasi-feature points
as shown in Fig. 6A.

In the preferred embodiment of the invention, for- 
mation of unique triangles from a set of arbitrarily dis- 
tributed QP's is obtained by adding a new line segment 
between one of QP's and its nearest QP, starting from a 
QP of the highest priority. For instance, if seven QP's, 
e.g., QP1 to QP7, are randomly distributed in a rectan- 
gular processing region of a current frame of 6 x 5 pixels 
as shown in Fig. 6A, formation of line segments for the 
QP's is performed in a sequence of QP1 to QP7, 
wherein the numerals in the parentheses represent x 
and y coordinates of a QP measured from the origin, 
e.g., left-top corner pixel P1, of the rectangular process-
ing region. That is, priority is given to the QP's in an
ascending order of their y values. If more than one QP
has the same y value, priority will be given to the QP's in
an ascending order of their x values.
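
The priority rule stated above amounts to sorting the QP's by their y values and then by their x values. A minimal sketch follows, assuming each QP is an (x, y) tuple measured from the left-top corner of the processing region; the coordinates are made up for illustration and are not those of Fig. 6A.

```python
def order_by_priority(quasi_points):
    """Sort QP's in ascending order of y, breaking ties by ascending x."""
    return sorted(quasi_points, key=lambda point: (point[1], point[0]))


quasi_points = [(3, 2), (1, 0), (4, 0), (0, 2)]
ordered = order_by_priority(quasi_points)
```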

Specifically, for the set of QP's illustrated in Fig. 6A, 
a line segment QP1QP4 is selected for QP1 first, fol- 
lowed by a line segment QP2QP3 for QP2. QP3QP4 is 
determined as a line segment for QP3 because 
QP2QP3 has already been selected. The QP of the next
priority, i.e., QP4, has two nearest QP's, i.e., QP5 and
QP6. In such a case, QP4QP5 is selected because QP5 
has a higher priority. Similarly, line segments QP5QP6, 
QP6QP4 and QP7QP3 are determined for QP5, QP6 
and QP7 in sequence. These processes are repeated 
until all the line segments are found with the condition 
that a newly added line segment may not overlap or 
intersect any of the previously selected line segments. 

Thereafter, the second set of the motion vectors is 
calculated by using an affine transformation technique. 
As is well known in the art, an arbitrary sequence of rota-
tion, translation and scale changes of a moving object
can be represented by the affine transformation. 

Assuming, as shown in Fig. 6B, that three pixels A,
B and C in the current frame are determined as quasi-
feature points corresponding to their respective feature
points A', B' and C' in the previous frame, pixels in a tri-
angle ABC of the current frame may be correlated to
those in the triangle A'B'C' of the previous frame by the
affine transformation defined as:



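$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix} \qquad \text{Eq. (1)}$$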



wherein (x, y) represents the x and y coordinates of a
pixel within the current frame and (x', y'), the coordi-
nates of a predicted position in the previous frame; and
a to f are affine transform coefficients.
Those six affine transform coefficients are calcu-
lated by solving six linear equations obtained from three
sets of related feature and quasi-feature points, i.e., A'-
A, B'-B and C'-C. Once the affine transform coefficients
are known, each of the remaining pixels in the triangle
ABC can be mapped onto a position in the triangle
A'B'C' through the use of Eq. 1. In this manner, pixels in
each triangle within the current frame can be predicted
from those of the previous frame. Pixels on the bound-
ary of two contiguous triangles, e.g., P1 shown in Fig.
6A, can be predicted from either one of the two trian-
gles.
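
The calculation of the six coefficients may be sketched as follows. The sketch assumes that the affine transformation has the form x' = a·x + b·y + e and y' = c·x + d·y + f, an arrangement of the coefficients a to f that is adopted here for illustration; under that assumption the six equations split into two 3 x 3 systems sharing one matrix, which are solved by Cramer's rule. The function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3 x 3 linear system by Cramer's rule."""
    d = det3(matrix)
    if d == 0:
        raise ValueError("the three quasi-feature points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def affine_coefficients(current_pts, previous_pts):
    """Coefficients (a, b, c, d, e, f) mapping A, B, C onto A', B', C'."""
    matrix = [[x, y, 1] for x, y in current_pts]
    a, b, e = solve3(matrix, [xp for xp, _ in previous_pts])
    c, d, f = solve3(matrix, [yp for _, yp in previous_pts])
    return a, b, c, d, e, f


def predict_position(coeffs, x, y):
    """Map a current-frame pixel (x, y) to its predicted position (x', y')."""
    a, b, c, d, e, f = coeffs
    return a * x + b * y + e, c * x + d * y + f


# Quasi-feature points A, B, C and their feature points A', B', C' (pure translation).
coeffs = affine_coefficients([(0, 0), (4, 0), (0, 4)], [(1, 2), (5, 2), (1, 6)])
position = predict_position(coeffs, 1, 1)
```

For this pure translation the matrix part reduces to the identity and (e, f) equals the common displacement of the three points.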



Thereafter, a motion vector for each of the pixels,
P(x, y), in the current frame is determined from a dis-
placement between the pixel P and its prediction P'(x',
y') as:

$$M_x = x' - x, \qquad M_y = y' - y \qquad \text{Eq. (2)}$$



wherein M_x and M_y are the x and y components of the
motion vector for the pixel P, respectively. 

In the preferred embodiment of the invention, 
motion vectors for the pixels lying outside the process- 
ing region in the current frame are set to zeros. 
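
The assembly of the second set of motion vectors may be sketched as follows, taking the x and y components of each motion vector as x' - x and y' - y. The sketch assumes that the predicted positions for the pixels inside the processing region are already available as a dictionary keyed by pixel coordinates; that data layout and the function names are hypothetical.

```python
def motion_vector(pixel, predicted):
    """Displacement from the pixel P(x, y) to its prediction P'(x', y')."""
    (x, y), (xp, yp) = pixel, predicted
    return xp - x, yp - y


def motion_field(width, height, predicted_positions):
    """Per-pixel motion vectors; pixels without a prediction keep a zero vector.

    predicted_positions maps (x, y) inside the processing region to (x', y').
    """
    field = [[(0, 0)] * width for _ in range(height)]
    for (x, y), predicted in predicted_positions.items():
        field[y][x] = motion_vector((x, y), predicted)
    return field


field = motion_field(3, 2, {(1, 0): (2.5, 1.0)})
```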

Referring back to Fig. 2, provided from the current
frame motion vector detection block 250 to the motion 
compensation block 260 is the second set of motion 
vectors for the pixels of the current frame. 

The motion compensation block 260 retrieves each 
value of the pixels to be contained in a predicted current 
frame from the second frame memory 124 shown in Fig. 
1 by using each of the motion vectors contained in the 
second set, thereby providing the predicted current 
frame signal to the subtracter 102 and the adder 115 
shown in Fig. 1 via the line L30. In case both compo-
nents of a motion vector, i.e., M_x and M_y, are not inte-
gers, the predicted pixel value can be obtained by
interpolating pixel values of the pixels neighboring the 
position designated by the motion vector. 
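
One common way of interpolating the neighboring pixel values is bilinear interpolation, sketched below. The disclosure does not name a particular interpolation method, so the choice of bilinear weights, the clamping of positions to the frame boundary and the function name are assumptions made for illustration.

```python
import math


def bilinear_sample(frame, x, y):
    """Pixel value at a non-integer position (x, y) by bilinear interpolation."""
    height, width = len(frame), len(frame[0])
    # Clamp the position so that the four neighbours stay inside the frame.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * frame[y0][x0] + wx * frame[y0][x1]
    bottom = (1 - wx) * frame[y1][x0] + wx * frame[y1][x1]
    return (1 - wy) * top + wy * bottom


previous = [[10, 20],
            [30, 40]]
centre_value = bilinear_sample(previous, 0.5, 0.5)
corner_value = bilinear_sample(previous, 1.0, 0.0)
```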

In another preferred embodiment of the invention, 
the predicted positions, which are obtained from Eq. 1,
can be directly provided from the current frame motion 
vector detection block 250 to the motion compensation 
block 260 without resorting to the second set of motion 
vectors. Predicted positions for the pixels residing out- 
side the processing region in the current frame are set 
to have identical positions to those of respective pixels 
lying outside the processing region of the previous 
frame. The motion compensation block 260 then 
retrieves pixel values, which correspond to the predicted
positions, from the second frame memory 124, thereby 
providing the predicted current frame signal onto the 
line L30. 

Referring to Fig. 7, there is illustrated a block dia- 
gram, in accordance with the present invention, of the 
processing region detection block 210 depicted in Fig. 
2. Both of the current frame signal from the first frame 
memory 100 and the previous frame signal from the
second frame memory 124 are fed to a subtracter 700. 

The subtracter 700 serves to calculate the differ-
ence (A shown in Fig. 3) between the pixel values of the
current frame signal and the corresponding pixel values
of the previous frame signal on a pixel-by-pixel basis.
The resultant data, i.e., a frame difference signal, from
the subtracter 700 is dispatched to an absolutizing cir-
cuit 710. The absolutizing circuit 710 converts each
pixel difference value included in the frame difference
signal into its absolute value. Thereafter, an absolutized
frame difference signal is applied to a first comparator
720 which compares each absolutized pixel difference
value of the absolutized frame difference signal with a
predetermined threshold value TH1. Thereafter, the
absolutized frame difference signal is converted to have
two types of data, e.g., "1" or "0". That is, if an abso-
lutized pixel difference value is smaller than TH1, the
absolutized pixel difference value is set to "0". Other-
wise, the absolutized pixel difference value is set to "1".
The converted frame difference signal from the first
comparator 720 is fed to a terminal 1 of a switch SW1
on a block-by-block basis.
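
The operations of the subtracter 700, the absolutizing circuit 710 and the first comparator 720 may be sketched together as follows. Frames are assumed to be lists of rows of integers; the function name and the sample values are hypothetical.

```python
def binarize_frame_difference(current, previous, th1):
    """Set each pixel to 0 if |current - previous| < TH1 and to 1 otherwise."""
    return [[0 if abs(c - p) < th1 else 1 for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]


current = [[10, 50], [12, 9]]
previous = [[10, 20], [30, 10]]
converted = binarize_frame_difference(current, previous, th1=5)
```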

At the switch SW1, the terminal 1 is connected to a
terminal 3 which links the first comparator 720 to a third
frame memory 730 until the third frame memory 730 is
filled with the converted frame difference signal com-
pletely, wherein the switch SW1 is controlled by a con-
trol signal CS1 from a controller (not shown). If the third
frame memory 730 is filled, the terminal 1 of the switch
SW1 is separated from the terminal 3 and a terminal 2
is then coupled to the terminal 3. In the meantime, a
control signal CS2 is applied to a switch SW2 and a ter-
minal 4 of the switch SW2 is connected to a terminal 6
which links the third frame memory 730 to a noise
reduction unit 740. The converted frame difference sig-
nal stored in the third frame memory 730 is fed to the
noise reduction unit 740 through the terminals 4 and 6.

At the noise reduction unit 740, spot noises which
may be contained in the converted frame difference sig-
nal are detected and removed. The spot noises may be
easily detected by comparing the difference value of
pixels with an average or median difference value of the
neighboring pixels. However, since each pixel difference
value of the converted frame difference signal is repre-
sented by the two types of data, i.e., "0" and "1", for the
detection of the spot noises, the noise reduction unit
740 employs a known windowing method. That is, the
noise reduction unit 740 counts the pixel difference val-
ues of "1" within a suitably chosen window, which
includes the target pixel difference value of "1" to be
processed at the center thereof. The count
number is then compared with a predetermined number
TH2. If the count number is smaller than the predeter-
mined number TH2, the target pixel difference value is
confirmed as the spot noise and "0" is assigned to the
target pixel difference value by using an updating oper-
ation via the terminals 2 and 3. Otherwise, the target
pixel difference value will not be changed.
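
The windowing method may be sketched as follows. Two simplifications are assumed: the window is a square of side `2 * radius + 1` truncated at the frame boundary, and the counts are taken from the unmodified input map with the results written to a copy, whereas the apparatus described above updates the third frame memory 730 directly. The function name and parameters are hypothetical.

```python
def remove_spot_noise(binary, th2, radius=1):
    """Clear each "1" whose window contains fewer than TH2 values of "1"."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue  # only target values of "1" are examined
            # Count the "1" values in the window, including the target itself.
            count = sum(binary[j][i]
                        for j in range(max(0, y - radius), min(height, y + radius + 1))
                        for i in range(max(0, x - radius), min(width, x + radius + 1)))
            if count < th2:
                cleaned[y][x] = 0  # confirmed as spot noise
    return cleaned


noisy = [[1, 0, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 0]]
cleaned = remove_spot_noise(noisy, th2=3)
```

The isolated "1" in the corner is removed, while the 2 x 2 cluster survives because each of its members sees four values of "1" in its window.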



When the target pixel difference values are
updated, the controller (not shown) generates the con-
trol signal CS2 for controlling the switch SW2. That is, at
the switch SW2, the terminal 4 is connected to a termi-
nal 5 and the noise-reduced frame difference signal
from the third frame memory 730 is transmitted to a
counter 750.

At the counter 750, the number of pixel difference 
values of "1" contained in each image block is counted. 
Thereafter, the counted number of pixel difference val- 
ues of "1" contained in each image block is sequentially 
fed to a second comparator 760. At the second compa-
rator 760, the counted number of pixel difference values
of "1" is compared with a predetermined number TH3. If
the counted number is greater than the predetermined 
number TH3, "1" is assigned as a block representative 
value to the image block. Otherwise, "0" is assigned as 
the block representative value to the image block. 
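
The operations of the counter 750 and the second comparator 760 may be sketched as follows, assuming square image blocks of side `block_size` and a frame whose dimensions are multiples of that size. The function name and the sample data are hypothetical.

```python
def block_representative_values(binary, block_size, th3):
    """Assign 1 to each image block containing more than TH3 values of "1"."""
    rows = len(binary) // block_size
    cols = len(binary[0]) // block_size
    region = []
    for by in range(rows):
        row = []
        for bx in range(cols):
            # Counter 750: number of "1" values inside this image block.
            count = sum(binary[y][x]
                        for y in range(by * block_size, (by + 1) * block_size)
                        for x in range(bx * block_size, (bx + 1) * block_size))
            # Second comparator 760: compare the count with TH3.
            row.append(1 if count > th3 else 0)
        region.append(row)
    return region


difference_map = [[0, 0, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 0, 0]]
region_info = block_representative_values(difference_map, block_size=2, th3=1)
```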

Referring to Fig. 8, there is demonstrated an array 
of block representative values based on the difference 
between two frames shown in Fig. 3. As shown, "1" is 
assigned to the image blocks contained in the process-
ing region and "0" is assigned to the other blocks lying
outside the processing region. The block representative
values of the array are scanned sequentially as the 
region information which is fed to the feature point 
selection block 230 shown in Fig. 2 and the entropy 
coder 107 shown in Fig. 1. 

As may be seen from the above, since the motion 
vector detector initially detects a processing region 
encompassing a moving object by using the difference 
between the current frame and the previous frame, the 
inventive encoder can obtain the motion vectors for a 
limited number of feature points selected from the 
processing region to thereby greatly reduce the compu- 
tational burden and amount of motion vectors to be 
transmitted, thereby improving the coding efficiency. 

While the present invention has been shown and 
described with respect to the particular embodiments, it 
will be apparent to those skilled in the art that many 
changes and modifications may be made without 
departing from the spirit and scope of the invention as 
defined in the appended claims. 

Claims 

1. An apparatus, for use in a motion-compensated 
video signal encoder, for determining a predicted 
current frame based on a current frame and a pre- 
vious frame of a digital video signal, comprising: 

region detection means for detecting a 
processing region encompassing a moving 
object from the previous frame based on a dif- 
ference between the current and the previous 
frames to generate region information repre- 
senting the detected processing region; 
feature point selection means for selecting a 
number of pixels from the pixels contained in 
the detected processing region as feature
points based on the region information; 
first motion vector detection means for detect- 
ing a first set of motion vectors between the 
current and the previous frames, each of the 
first set of motion vectors representing a 
motion for each of the feature points; 
second motion vector detection means for pro- 
ducing a second set of motion vectors for all of 
the pixels contained in the current frame by 
using the first set of motion vectors; and 
motion compensation means for assigning the 
value of each of the pixels in the previous 
frame, said each of the pixels corresponding to 
one of the pixels in the current frame through 
one of the second set of motion vectors, as the 
value of said one of the pixels in the current 
frame, to thereby determine the predicted cur- 
rent frame. 

2. The apparatus as recited in claim 1, wherein said
region detection means includes: 

means for calculating the difference between 
the current and the previous frames on a pixel- 
by-pixel basis to generate a frame difference 
signal wherein the frame difference signal 
includes NxM blocks, each block having PxQ 
pixel difference values and N, M, P and Q are 
positive integers; 

means for absolutizing the frame difference 
signal to generate an absolutized frame differ- 
ence signal; 

means for comparing the absolutized frame dif- 
ference signal with a predetermined value to 
convert the absolutized frame difference signal 
into a converted frame difference signal, 
wherein, when a pixel difference value of the 
absolutized frame difference signal is smaller 
than the predetermined value, "0" is assigned 
as the pixel difference value and, otherwise, "1"
is assigned as the pixel difference value; 
means for counting the number of "1"'s con- 
tained in each block of the converted frame dif- 
ference signal; 

means for comparing the counted number for 
each block with a predetermined number to 
generate the region information having 
sequentially arranged NxM block representa- 
tive values, wherein, when the counted number 
for a block is smaller than the predetermined 
value, "0" is assigned as a block representative 
value of the region information and, otherwise, 
"1" is assigned as the block representative 
value. 

3. The apparatus as recited in claim 2, wherein the 
second set of motion vectors includes a third set of 
motion vectors for all of the pixels contained in the 
processing region of the current frame, which is
obtained based on the first set of motion vectors,
and a fourth set of motion vectors for the pixels lying
outside the processing region of the current frame,
which is made to consist of zero vectors.

4. An encoding apparatus, for use in a video signal
encoder, for reducing a transmission rate of a digital
video signal, said digital video signal having a plurality
of frames including a current frame and a previous
frame, comprising:

memory means for storing the previous frame
of the digital video signal;
region detection means for detecting a
processing region encompassing a moving
object from the previous frame based on a difference
between the current and the previous
frames to generate region information representing
the detected processing region;
feature point selection means for selecting a
number of pixels from the pixels contained in
the detected processing region as feature
points based on the region information;
first motion vector detection means for detecting
a first set of motion vectors between the
current and the previous frames, each of the
first set of motion vectors representing a
motion for each of the selected pixels;
second motion vector detection means for producing
a second set of motion vectors for all of
the pixels contained in the current frame by
using the first set of motion vectors;
motion compensation means for assigning the
value of each of the pixels in the previous
frame, said each of the pixels corresponding to
one of the pixels in the current frame through
one of the second set of motion vectors, as the
value of said one of the pixels in the current
frame, to thereby determine the predicted current
frame;
means for generating a first frame difference
signal by subtracting the predicted current
frame from the current frame on a pixel-by-pixel
basis;
means for encoding the first frame difference
signal by using a discrete cosine transform and
quantization circuit;
means for decoding the encoded differential
pixel values to thereby provide a reconstructed
frame difference signal;
means for providing a reconstructed current
frame signal as the previous frame signal by
combining the reconstructed frame difference
signal and the predicted current frame signal;
and
means for statistically coding the first frame difference
signal, the first set of motion vectors
and the region information.

5. The apparatus as recited in claim 4, wherein said
region detection means includes:

means for calculating the difference between
the current and the previous frames on a pixel-by-pixel
basis to generate a second frame difference
signal wherein the second frame difference
signal includes NxM blocks, each block
having PxQ pixel difference values and N, M, P
and Q are positive integers;
means for absolutizing the second frame difference
signal to generate an absolutized frame
difference signal;
means for comparing the absolutized frame difference
signal with a predetermined value to
convert the absolutized frame difference signal
into a converted frame difference signal,
wherein, when a pixel difference value of the
absolutized frame difference signal is smaller
than the predetermined value, "0" is assigned
as the pixel difference value and, otherwise, "1"
is assigned as the pixel difference value;
means for counting the number of "1"'s contained
in each block of the converted frame difference
signal;
means for comparing the counted number for
each block with a predetermined number to
generate the region information having
sequentially arranged NxM block representative
values, wherein, when the counted number
for a block is smaller than the predetermined
value, "0" is assigned as a block representative
value of the region information and, otherwise,
"1" is assigned as the block representative
value.

6. The apparatus as recited in claim 5, wherein the
second set of motion vectors includes a third set of
motion vectors for all of the pixels contained in the
processing region of the current frame, which is
obtained based on the first set of motion vectors,
and a fourth set of motion vectors for the pixels lying
outside the processing region of the current frame,
which is made to consist of zero vectors.
